feat: Design of EstimatorReport
#997
| @@ -1,9 +1,11 @@ | |||
| """Enhance `sklearn` functions.""" | |||
|
|
|||
| from skore.sklearn._estimator import EstimatorReport | |||
It's disturbing that you want to expose something from a private/protected module.
Shouldn't skore.sklearn.estimator be exposed too by removing _?
Basically, I want the user to be able to do `skore.EstimatorReport` or `skore.sklearn.EstimatorReport`, but I don't want to expose it at a lower level. In scikit-learn (and other packages), whenever you don't want people to import from a private module, you add an `_`, even if it is a folder.
For instance, I would probably do the same for cross_validation.
However, it is something that we can discuss later.
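The re-export pattern discussed above can be sketched as follows. This is a minimal illustration, not skore's actual layout: the package name `demo_pkg` and its contents are hypothetical and are written to a temporary directory only so that the snippet runs on its own.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Build a throwaway package on disk (hypothetical name: demo_pkg).
root = Path(tempfile.mkdtemp())
pkg = root / "demo_pkg"
pkg.mkdir()

# The implementation lives in a private module, marked by the leading underscore.
(pkg / "_estimator.py").write_text("class EstimatorReport:\n    pass\n")

# The package __init__ re-exports the class, so users never import the private module.
(pkg / "__init__.py").write_text(
    "from demo_pkg._estimator import EstimatorReport\n"
    "__all__ = ['EstimatorReport']\n"
)

sys.path.insert(0, str(root))
demo_pkg = importlib.import_module("demo_pkg")

# Public access path: demo_pkg.EstimatorReport
public_cls = demo_pkg.EstimatorReport
# Private access path (discouraged by the underscore): demo_pkg._estimator.EstimatorReport
private_cls = importlib.import_module("demo_pkg._estimator").EstimatorReport
```

Both paths resolve to the same class object; the underscore is only a convention signalling which import path is supported.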
skore/tests/conftest.py
Outdated
| """Setup and teardown fixture for matplotlib. | ||
| This fixture checks if we can import matplotlib. If not, the tests will be | ||
| skipped. Otherwise, we close the figures before and after running the |
FMI, why close the figures before, and not just after?
I don't have a definitive answer since I did not write the original fixture in scikit-learn. My guess is that some test might fail and never reach the teardown, so closing the figures before the next test gives it a clean start. However, I'm unsure.
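The "close before and after" idea can be sketched with the standard library alone. This is a stand-in, not the fixture from the PR: `open_figures`, `close_all`, and `pyplot_fixture` are hypothetical names that model a global figure registry, whereas the real fixture would presumably use pytest and `matplotlib.pyplot.close("all")`.

```python
from contextlib import contextmanager

# Hypothetical stand-in for matplotlib's global registry of open figures.
open_figures = []


def close_all():
    """Model of closing every open figure."""
    open_figures.clear()


@contextmanager
def pyplot_fixture():
    close_all()  # setup: drop anything left open by earlier code
    try:
        yield open_figures
    finally:
        close_all()  # teardown: leave nothing behind for later code


# Earlier code that did not go through the fixture leaves a figure open.
open_figures.append("leaked figure")

with pyplot_fixture() as figures:
    seen_at_start = list(figures)  # state observed by the "test" on entry
    figures.append("figure from this test")

seen_after = list(open_figures)  # state left behind after the "test"
```

With only the teardown step, the "test" would start with the leaked figure still open; the setup step is what guarantees a clean start regardless of what ran before.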
| "estimator[/bold cyan]" | ||
| ) | ||
|
|
||
| def _create_help_tree(self): |
Can you please add a representation of the reporter's attributes to the helper?
For instance, it can help users to know that the reporter contains the fitted estimator.
| ) | ||
| ) | ||
| # trigger the computation | ||
| list( |
Maybe we could have a list of indeterminate progress bars instead of one progress bar that "jumps".
Happy to see what we can do to improve the current state.
| @@ -0,0 +1,168 @@ | |||
| from typing import Any, Callable, Literal, Optional, Union | |||
To-do: check if removing the stub files breaks the auto-completion or not, and check if a work-around exists (ping @augustebaum).
Co-authored-by: Sylvain Combettes <[email protected]>
OK. It should be good to go and we should be able to iterate.
sylvaincom
left a comment
Many thanks for this very useful PR @glemaitre, and thanks to the whole team for reviewing it! Let's iterate on sub-issues if needed.
closes probabl-ai#834

Investigate an API for a `EstimatorReport`.

#### TODO

- [x] Metrics
  - [x] handle string metrics as specified in the accessor
  - [x] handle callable metrics
  - [x] handle scikit-learn scorers
  - [x] use the cache as efficiently as possible
  - [x] add testing for all of those features
  - [x] allow passing a new validation set to functions instead of using the internal validation set
  - [x] add a proper help and rich `__repr__`
- [x] Plots
  - [x] add the roc curve display
  - [x] add the precision recall curve display
  - [x] add prediction error display for regressor
  - [x] make proper testing for those displays
  - [x] add a proper `__repr__` for those displays
- [x] Documentation
  - [x] (done for the checked part) add an example to showcase all the different features
  - [x] find a way to show the accessors documentation in the page of `EstimatorReport`. It could be a bit tricky because they are only defined once the instance is created.
    - We need to have a look at the `series.rst` page from pandas to see how they document this sort of pattern.
  - [x] check the autocompletion: when typing `report.metrics.->tab` it should provide the autocompletion. **edit**: having a stub file is actually working. I prefer this to type hints directly in the file.
- Open questions
  - [x] we use hashing to retrieve the external set.
    - use the caching for the external validation set? To make it work we need to compute the hash of potentially big arrays. This might be more costly than making the model predict.

#### Notes

This PR builds upon:

- probabl-ai#962 to reuse the `skore.console`
- probabl-ai#998 to be able to detect clusterers in a consistent manner.
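The open question about hashing the external validation set can be sketched as below. This is a minimal illustration under stated assumptions, not the caching code from this PR: `CachedPredictor` and `double` are hypothetical names, and a standard-library `array` stands in for a NumPy array.

```python
import hashlib
from array import array


class CachedPredictor:
    """Hypothetical wrapper that caches predictions keyed by a hash of the input data."""

    def __init__(self, predict):
        self._predict = predict
        self._cache = {}
        self.n_calls = 0  # number of times the underlying predict function actually ran

    def predict(self, X):
        # Hashing reads every byte of X, which is the cost raised in the open question.
        key = hashlib.sha256(X.tobytes()).hexdigest()
        if key not in self._cache:
            self.n_calls += 1
            self._cache[key] = self._predict(X)
        return self._cache[key]


def double(X):
    """Toy stand-in for a fitted model's predict method."""
    return [2 * x for x in X]


model = CachedPredictor(double)
X_val = array("d", [1.0, 2.0, 3.0])

first = model.predict(X_val)
# A different object with identical content hits the cache.
second = model.predict(array("d", [1.0, 2.0, 3.0]))
```

The trade-off is visible in `predict`: the hash is computed on every call, so for a cheap model and a large array the lookup may cost more than simply predicting again.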
